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Abstract. Highly coherent sensing matrices arise in discretization of continuum imaging 
problems such as radar and medical imaging when the grid spacing is below the Rayleigh 
threshold. 

Algorithms based on techniques of band exclusion (BE) and local optimization (LO) are 
proposed to deal with such coherent sensing matrices. These techniques are embedded in 
the existing compressed sensing algorithms such as Orthogonal Matching Pursuit (OMP), 
Subspace Pursuit (SP), Iterative Hard Thresholding (IHT), Basis Pursuit (BP) and Lasso, 
and result in the modified algorithms BLOOMP, BLOSP, BLOIHT, BP-BLOT and Lasso- 
BLOT, respectively. 

Under appropriate conditions, it is proved that BLOOMP can reconstruct sparse, widely 
separated objects up to one Rayleigh length in the Bottleneck distance independent of the 
grid spacing. One of the most distinguishing attributes of BLOOMP is its capability of 
dealing with large dynamic ranges. 

The BLO-based algorithms are systematically tested with respect to four performance 
metrics: dynamic range, noise stability, sparsity and resolution. With respect to dy- 
namic range and noise stability, BLOOMP is the best performer. With respect to sparsity, 
BLOOMP is the best performer for high dynamic range while for dynamic range near unity 
BP-BLOT and Lasso-BLOT with the optimized regularization parameter have the best per- 
formance. In the noiseless case, BP-BLOT has the highest resolving power up to certain 
dynamic range. 

The algorithms BLOSP and BLOIHT arc good alternatives to BLOOMP and BP/Lasso- 
BLOT: they are faster than both BLOOMP and BP/Lasso-BLOT and shares, to a lesser 
degree, BLOOMP's amazing attribute with respect to dynamic range. 

Detailed comparisons with existing algorithms such as Spectral Iterative Hard Thresh- 
olding (SIHT) and the frame-adapted BP are given. 



Reconstruction of a high-dimensional sparse signal from sparse linear measurements is a 
fundamental problem relevant to imaging, inverse problems and signal processing. 

Consider, for example, the problem of spectral estimation in signal processing. Let the 
uncontaminated signal y(t) be a linear combinations of s time-harmonic components 
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where Cj are the amplitudes. Suppose that y(t) is contaiminated by noise n(t) and the 
received signal is 

(2) b(t) = y(t)+n(t). 

The task is to find out the frequencies Q = {ujj} and the amplitudes {cj} by sampling b(t) 
at discrete times. 

A standard approach to spectral estimation is to turn the problem into the linear inversion 
problem as follows. To fix the idea, let tk,k = 1,...,N be the sample times in the unit 
interval [0, 1]. Set b = (b(t k )) G to be the data vector. We approximate the frequencies 
by the unknown closest subset of cardinality s of a regular grid Q = {p±, . . . ,Pm} and write 
the corresponding amplitudes as x = (xj) G C M where the components of x equal the 
amplitudes {cj} whenever the grid points are the nearest grid points to the frequencies {uj} 
and zero otherwise. Typically the number of approximating grid points is far greater than 
the number of frequencies, i.e. M ^> s. 

Let the measurement matrix be 

(3) A= [ ai ... a M ] eC NxM 
with 

(4) a, = J- (e" i2 ^) G C", j = 1, M. 

v A 



We cast the spectral estimation problem into the form 

(5) Ax + e = b 

where the error vector e = (e^) G is the sum of the external noise n = (n(tk)) and the 
discretization or gridding error d = (5k) G C N due to approximating the frequencies by the 
grid points in Q. By definition, the gridding error is given by 

(6) d = b — n — Ax. 

Small gridding error requires that the objects are a priori close to the grid points. The 
gridding error is related to basis mismatch analyzed in [12]. 

Sparse reconstruction with N, s <C M, where s is the sparsity of x, has recently attracted 
a lot of attention in various areas thanks to the breakthroughs in compressive sensing (CS) 
[HI [T7J EI]. The main thrust of CS is the ^-minimization principle, Basis Pursuit (BP) 
and Lasso, for solution characterization. Many L 1 -based algorithms as well as the alterna- 
tive, greedy algorithms, which are not directly based on global optimization, require either 
incoherence or Restricted Isometry Property (RIP) to have good performances. 

One commonly used characterization of incoherence in CS is in terms of the mutual co- 
herence jji,. Let the pairwise coherence between the k-th and j-th columns be 

(7) MMH^rrr 1 - 

l a fcl! a z| 

The mutual coherence of A is the maximum pairwise coherence among all pairs of columns 

(8) MA) = max/i(fc, I). 

According to theory of optimal recovery [16], for time sampling in [0,1], the minimum 
resolvable length in the frequency domain is unity. This is the Rayleigh threshold and we shall 
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Figure 1. The relative gridding error is roughly inversely proportional to the 
refinement factor. 



refer to this length as the Rayleigh length (RL). Hence for the traditional inversion methods 
to work, it is essential that the grid spacing in Q is no less than 1 RL. In the CS setting 
the Rayleigh threshold is closely related to the decay property of the mutual coherence [23J . 
Moreover, for Q C Z and uniformly randomly selected tk G [0, 1] the corresponding matrix 
A is a random partial Fourier matrix which has a decaying mutual coherence /i = 0(N' 1 ^ 2 ) 
and satisfies RIP with high probability [5J ED] • 

Without any prior information about the object support, the gridding error for the resolved 
grid, however, can be as large as the data themselves, creating a unfavorable condition for 
sparse reconstruction. To reduce the gridding error, it is natural to consider the fractional 
grid 

(9) Z/F = {3 IF : j e Z} 

with some large integer F 6 N which is the refinement factor. Figure [T] shows that the 
relative gridding error ||d||2/||b|| 2 is roughly inversely proportional to the refinement factor. 
The mutual coherence, however, increases with F as the near-by columns of the sensing 
matrix become highly correlated. 

Figure [2] shows the coherence pattern k)] of a 100 x 4000 matrix Q with F = 20 (left 
panel). The bright diagonal band represents a heightened correlation (pairwise coherence) 
between a column vector and its neighbors on both sides (about 30). The right panel of 
Figure [2] shows a half cross section of the coherence band across two RLs. Sparse recovery 
with large F exceeds the capability of currently known algorithms as the condition number 
of the 100 x 30 submatrix corresponding to the coherence band in Figure [2] easily exceeds 
10 15 . The high condition number makes stable recovery impossible. 

The difficulty with unresolved grids is not limited to the problem of spectral estimation 
in signal processing. Indeed, the issue is intrinsic and fundamental to discretization of PDE- 
based inverse problems such as remote sensing and medical imaging (TTJ [131 EZj- While 
Figure [2] is typical of the coherence pattern from discretization of one-dimensional problem. 
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Figure 2. Coherence pattern [fi(j, k)} for the 100 x 4000 matrix with F = 20 
(left). The off-diagonal elements tend to diminish as the row number increases. 
The coherence band near the diagonals, however, persists, and has the profile 
shown on the right panel where the vertical axis is the pairwise coherence and 
the horizontal axis is the separation between two columns in the unit of RL. 

In two or three dimensions, the coherent pattern is more complicated than Figure [2j Never- 
theless the coherence band typically reflects proximity in the physical space. The proximity 
between the object support and its reconstruction can be described by the Bottleneck or 
the Hausdorff distance [21 J. More generally, coherent bands can arise in sparse and redun- 
dant representation by overcomplete dictionaries (see Section [6] for an example). Under this 
circumstance, the Bottleneck or Hausdorff distance may not have a direct physical meaning. 

In any case, the hope is that if the objects are sufficiently separated with respect to the co- 
herence band, then the problem of a huge condition number associated with unresolved grids 
can be somehow circumvented and the object support can be approximately reconstructed. 

Under this additional assumption of widely separated objects, we propose in the present 
work several algorithmic approaches to recovery with unresolved grids and provide some 
performance guarantee for these algorithms. 

The paper is organized as follows. In Section [2] we introduce the technique of band ex- 
clusion (BE) to modify the Orthogonal Matching Pursuit (OMP) and obtain a performance 
guarantee for the improved algorithm, called Band-excluded OMP (BOMP). In Section [3] 
we introduce the technique of Local Optimization (LO) and propose the algorithms, Lo- 
cally Optimized OMP (LOOMP) and Band-excluded LOOMP (BLOOMP). In Section g] 
we introduce the technique of Band- Excluded Thresholding (BET), which comes in two 
form, Band-excluded Matched Thresholding (BMT) and Band-excluded Locally Optimized 
Thresholding (BLOT) and propose the algorithms, Band-excluded, Locally Optimized Sub- 
space Pursuit (BLOSP), Band-excluded, Locally Optimized CoSaMP (BLOCoSaMP), Band- 
excluded, Locally Optimized Iterative Hard Thresholding (BLOIHT) and BP/Lasso with 
BLOT (BP/Lasso-BLOT). In Section [5] we present numerical study of the comparative ad- 
vantages of various algorithms. In Section [6] we compare the performance of our algorithms 
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with the existing algorithms, Spectral Iterative Hard Thresholding (SIHT) and coherent- 
dictionary-based BP recently proposed in [19] and [5], respectively. We conclude in Section 

m 

2. Band Exclusion (BE) 

The first technique that we introduce to take advantage of the prior information of widely 
separated objects is called Band Exclusion and can be easily embedded in the greedy algo- 
rithm, Orthogonal Matching Pursuit (OMP) (151125] . 

First let us recall a standard performance guarantee for OMP [IB] . 

Proposition 1. Suppose that the sparsity s of the signal vectors, satisfies 
(10) /i(A)(2s-l) + 2^^ < 1 

where x min = min|a; fc | = \x s \. Denote by x ; the output of the OMP reconstruction. Then 

k 

suppix.) = supp(x.) 

where supp(x.) is the support o/x. 



In the ideal case where e = 0, (10) reduces to 
(11) //(A) < ! 



2s - 1 

which is near the threshold of OMP's capability for exact reconstruction of arbitrary objects 
of sparsity s. 

Intuitively speaking, if the objects are not in each other's coherence band, then it should 
be possible to localize the objects approximately within their respective coherence bands, no 
matter how large the mutual coherence is. 

Let us first define precisely the notion of coherence band. Let rj > 0. Define the ^-coherence 
band of the index k to be the set 

(12) B v (k) = {i\fM(i,k)>T,}, 

and the 77-coherence band of the index set S to be the set 

B V (S) = U keS B v (k). 

Due to the symmetry k) = fi(k,i),Wi, k, i G B v (k) if and only if k E B v (i). 
Denote 

(13) Bf(fc) = B T ,(B ri (k)) = U ieBv(k) B v (j) 

(14) BW(S) = B r ,(B v (S)) = U keS B^(k). 

To imbed BE into OMP, we make the following change to the matching step 
W = argminl (r"" 1 ,^) |, i £ B^{S n ~ l ), n = 1,2,.... 

meaning that the double 77-band of the estimated support in the previous iteration is avoided 
in the current search. This is natural if the sparsity pattern of the object is such that 
B v (J),j & supp(x) are pairwise disjoint. We call the modified algorithm the Band-excluded 
Orthogonal Matching Pursuit (BOMP) which is formally stated in Algorithm 1. 



Algorithm 1. Band-Excluded Orthogonal Matching Pursuit (BOMP) 
Input: A, b, ?7 > 

Initialization: x° = 0, r° = b and S° = 
Iteration: For n = 1, s 

1) Vax = argmini | (r"" 1 ,^) \,i <£ B?{S n - 1 ) 

2) S n = S"- 1 U { W } 

3) x n = argmin z ||Az — t» 1 1 2 s.t. supp(z) G S n 

4) r n = b - Ax n 
Output: x s . 



A main theoretical result of the present paper is the following performance guarantee for 
BOMP 

Theorem 1. Let x be s- sparse. Let rj > be fixed. Suppose that 

(15) B v (i) n Bf\j) = 0, Vz, j G supp(x) 
and t/iai 

(16) r/(5s _4)^ + |Mk<l 
where 

%max max UCm , ^min mm Xfc • 
fe k 

Let x 6e i/ie BOMP reconstruction. Then supp(x.) C B v (supp(x)) and moreover every 
nonzero component of x z's z'n the rj-coherence band of a unique nonzero component of x. 



Proof. We prove the theorem by induction. 

Suppose supp(x) = {Ji, . . . , J s }. Let J max G supp(x) be the index of the largest component 
in absolute value of x. 

In the first step, 



(17) 



|b*aj 

I 'J n 



xj +i|,a*,a f + ... + x j a* 7 a ( +e*a 7 



> x 



max ^max 



i)v - \\ e h 



by assumption (15). On the other hand, V/ ^ -B, ) (supp(x)), 

(18) |b*a ; | = l^a^a; + xj 2 a* j2 an + ... + x^a^a, + e*a;| 

< Xm^ST] + ||e|| 2 



by using (15) again. 



Hence, if 



(2s- l)r? + 2^ < 1, 

•^max 



then the right hand side of (17) is greater than the right hand side of (18) which implies that 



the first index selected by BOMP must belong to i?^(supp(x)). 
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Now suppose without loss of generality that the first (k — 1) indices Ji, ...,Ik-i selected by 
BOMP are in B v (Ji), Ji G supp(x),i = 1, k — 1, respectively. Write the residual as 

r*" 1 = b - c h a h - c l2 a l2 - ... - c/^a^. 
First, we estimate the coefficients c/ i; c/ fc _ 1 . Since (r^-^a^) = 0, 

c h = xj.a^a/, + xj 2 a.j 2 a h + ... + x Ja &j 3 *h + e * a /i - c / 2 a / 2 a /i - ••• - c 4-i a Li a ^i' 
which implies 

|c/J < x max + x max (s - l)r/ + ||e|| 2 + ^(|c/ a | + \c h \ + ... + |c Jfe _ 1 1) 
Likewise, we have 

(19) |cjj < x max + x max (s - l)r/ + ||e|| 2 + 77 [cj.|, j = 1, k — 1. 



Let c max = max cj, . Inequality (19) implies that 
j=i,...,fc-i J 



Cmax — ^max "I - -^max^ 1)^7 1 1 1 1 2 I 2)c ma; 

and hence 

Cmax ^ j- ?y( jfc 2~) ^ max *^' max ('^ I)'? "I - II ^11 2] 



Moreover, condition (16) implies that r](s — 1) < | and 1 _ ?? ^,_ 2 ^ < |. Hence 
. . 5 . 1 |. |. . 3 5 H | 

(20) C max ^(^max "I - g^max "I - ||e||2) 2^ max ^ll^ll^ - 

We claim that B^'(S k ~ 1 ) and {J k , J s } are disjoint. 

If the claim is not true, then there exists J i: for some i G {k, . . . , s}, Jj G B^\li) for some 

/ G {1, — 1}. Consequently, J, G -B^ (Jj) or equivalently B n (Ji) PI -B^ (Jj) 7^ which is 
contradictory to the assumption (15). 

Now we show that the index selected in the k-th. step is in B v ({J k , J s }). 

On the one hand, we have 

(21) k^ u ajj = \x Jl a* 1 a Ji + ... + xj g a i } 3 ^J i + e k aj. 

> Xmin ~ V( S - l^max ~ ||e|| 2 - T)(k - l)c max . 

One the other hand, we have that V/ £ Bj 1 2 \S k - 1 ) U B v ({J k , J,}), 

(22) |r fc_u a| = Ixj^X&i + ... + x Jk SL* Jk an + ... + arj.a^aj + e*a; 

< ^sx max + ||e|| 2 + r](k - l)c max 

in view of B v ({Ji, . . . , Jfe-i}) C B^'(S k ~ 1 ) as a result of the induction assumption. 
If the right hand side of (21) is greater than the right hand side of (22) or equivalently 

(23) V (2s + 3k - 4)^ + (2 + 5 - V (k - 1))M^ < 1 



then the k-th index selected by BOMP must be in B v ({Jk, J s }) because 



Bf)(s w )n{J fe ,...,J 8 } 



and because the fc-th selected index does not belong in B^\S k 1 ) according to the band- 
exclusion rule. Condition (16) implies (23) by setting the maximal k = s in (23) and noting 
that r](5s - 4) < 1 under (^6f . □ 

Remark 1. In the case of the matrix Q), if every two indices in supp(~x) is more than one 
RL apart, then 77 is small for sufficiently large N, cf. Figure^ 

When the dynamic range x max /x m i n = 0(1), Theorem^ guarantees approximate recovery 
of 0(q~ l ) sparsity pattern by BOMP. 

Remark 2. The main difference between Theorem l]an<i Proposition\^lies in the role played 
by the dynamic range x max /x m i n and condition (15). 

First, numerical evidence points to degradation in BOMP's performance for large dynamic 
ranges (Figure^). This is consistent with the prediction of (16). 

Secondly, condition (15) means that BOMP can resolve 3 RLs. Numerical experiments 
show that BOMP can resolve objects separated by close to 1 RL when the dynamic range is 
close to 1 (Figure^). 



3. Local Optimization (LO) 

As our numerical experiments show, the main shortcoming with BOMP is in its failure to 
perform even when the dynamic range is only moderate. 

To overcome this problem, we now introduce the second technique: the Local Optimization 
(LO). 

LO is a residual-reduction technique applied to the current estimate S k of the object 
support. To this end, we minimize the residual ||Ax — t» 1 1 2 by varying one location at a 
time while all other locations held fixed. In each step we consider x whose support differs 
from S n by at most one index in the coherence band of S n but whose amplitude is chosen 
to minimize the residual. The search is local in the sense that during the search in the 
coherence band of one nonzero component the locations of other nonzero components are 
fixed. The amplitudes of the improved estimate is carried out by solving the least squares 
problem. Because of the local nature of the LO step, the computation is not expensive. 



Algorithm 2. Local Optimization (LO) 
Input: A, b, r\ > 0, S° = {i±, . . . , ik}. 
Iteration: For n — 1, 2, k. 

1) x" = arg min z ||Az - b|| 2 ,supp(z) = (S^U^}) U {j n }, for some j n E B v ({i n }). 

2) S n = supp(x n ). 
Output: S k . 



Embedding LO in BOMP gives rise to the Band-excluded, Locally Optimized Orthogonal 
Matching Pursuit (BLOOMP). 
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Algorithm 3. Band-excluded, Locally Optimized Orthogonal Matching Pursuit (BLOOMP) 
Input: A, b, rj > 

Initialization: x° = 0, r° = b and S° = 
Iteration: For n = 1, s 

1) Vax = argmini | (r"" 1 ,^) \,i <£ B^^ 1 ) 

2) S n = LO(£ n_1 U {Vax}) where LO is the output of Algorithm 2. 

3) x n = argmin z ||Az — b|| 2 s.t. supp(z) G S n 

4) r n = b - Ax" 
Output: x s . 



We now give a condition under which LO does not spoil the BOMP reconstruction of 
Theorem [TJ 



Theorem 2. Let r] > and let x be a s-sparse vector such that (15) holds. Let S° and S k 
be the input and output, respectively, of the LO algorithm. 
If 



(24) x min >(e + 2(s - 1) V ) | y~ + \J (ff^ + 1 _ r] 2 

and each element of S° is in the 7]-coherence band of a unique nonzero component o/x, then 
each element of S k remains in the rj-coherence band of a unique nonzero component o/x. 

Proof. Because the iterative nature of Algorithm 2, it is sufficient to show that each element 
of S 1 is in the //-coherence band of a unique nonzero component of x. 
Suppose Ji G supp(x) and %\ G B v (Ji). Let 

r = min || Az — b|| 2 , supp(z) = (S^-fii}) U {Ji} 

z 

min || Az - b|| 2 , supp(z) = (S°\M) U {j}, J G B^i^B^Jt) . 



r 



We want to show that r < r',Wj G B v (ii)\B v (Ji) so that the LO step is certain to pick a 
new index within the ^-coherence band of Ji, reducing the residual in the meantime. For 
the subsequent analysis, we fix j G B v (ii)\B v ( Ji). 

Reset the J\ component of x to zero and denote the resulting vector by x'. Hence the 
sparsity of x' is s — 1. It follows from the definition of r that 

r < min || Az — Ax' — e|| 2 , supp(z) = {i 2 , . . . , ik}- 

z 

We also have 

r' = min ||A(z - x') - e + x^a.^ - ca.j\\ 2 , supp(z) = {i 2 , . . . , i k } } cGC 

z 

and hence by the law of cosine 



(25)r' > min w ||A(z — x') — e||| + Hx^a^ — ca,-||| — 2| (A(z — x') — e ) x Jl aij 1 — ca,-) 

z,c V 1 ' 

where supp(z) = {z 2 , . . . , i k }. 



Because of (15), j, J\ ^ £> r/ (supp(x') U {22, • • • , h})- By the definition of ^-coherence band, 
we have 

| (A(z -x') - e^j.a^ - ca^-) | < (e + r)(s + k - 2))(\x Jl \ + |c|) 



and hence by ( 25 ) 



r > min \/||A(z - x') - e||| + l^jj 2 + |c| 2 - 2r]\cx jj - 2(e + r](s + k - 2))(\x Jl \ + |c|) 

z,c V 

> . /min ||A(z — x') — e||| + min [|xjj 2 + |c| 2 — 2n\cxj 1 \ — 2(e + rj(s + k — 2))(\xj 1 \ + \c\ 

To prove r < r', it suffices to show 

min HxjJ 2 + |c| 2 - 2^(0X^1 - 2(e + rj(s + k — 2))(\x Jl \ + |c|)l 

cec 

= (l-r7 2 )|xj 1 | 2 -2(l + r/)(e + r7(s + A;-2))|xj 1 | - (e + r](s + k - 2)) 2 > 
which leads to the inquality 



\x A \ > (e + (s + k - 2)r]) I + J - 1 ' 



1 — ^ V (1 — v) 2 1 — r/ 2 J 
Considering the worst case scenario, we replace by x min and k by s to obtain the 



condition (24). 

□ 

Corollary 1. Lei x be the output of BLOOPM. Under the assumptions of Theorems^ and 
supp(x) C B v (supp(x)) and moreover every nonzero component o/x is in the r]-coherence 
band of a unique nonzero component o/x. 

Even though we can not improve the performance guarantee for BLOOMP, in practice 
the LO technique greatly enhances the success probability of recovery that BLOOMP has 
the best performance among all the algorithms tested with respect to noise stability and 
dynamic range (see Section [5]). In particular, the LO step greatly enhances the performance 
of BOMP w.r.t. dynamic range. 



4. Band-Excluded Thresholding (BET) 

The BE technique can be extended and applied to selecting s objects all at once in what 
is called the Band- Excluded Thresholding (BET). 

We consider two forms of BET. The first is the Band-excluded Matched Thresholding 
(BMT) which is the band-exclusion version of the One-Step Thresholding (OST) recently 
shown to possess compressed-sensing capability under incoherence conditions [TJ. 

For the purpose of comparison with BOMP, we give a performance guarantee for BMT 



under similar but weaker conditions than ( 15 )-( 16 ). 
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Algorithm 4. Band-excluded Matched Thresholding (BMT) 
Input: A,b,r] > 0. 
Initialization: S* = 0. 
Iteration: For k — 1, s, 

1) i k = argmax, | (b,a,-) | , V j $ B^{S k - 1 ). 

2) S k = S*- 1 U {z fc } 

Output x = argmin z ||Az — b|| 2 s.t. supp(z) C 5 s 

Theorem 3. Let x be s-sparse. Let rj > be fixed. Suppose that 

(26) B n (i) n B v (j) = 0, Wi,j e supp(x) 
and that 

(27) r ? (2 S -l) X ^ + 2| ^<l 
where 

^max max , ^min mm \X k . 
k k 

Let x be the BMT reconstruction. Then supp(x) C B^supp^.)) and moreover every nonzero 
component o/x is in the rj-coherence band of a unique nonzero component o/x. 

Proof. Let supp(x) = { Ji, . . . , J s }. Let J max G supp(x) be the index of the largest component 
of x in absolute value. 

On the one hand, for k — 1, s, 

(28) |b*a fe | = |xia*a fc + ... + x fc _iafc_ 1 a fc + x fc + a; fc+ ia^ +1 a fc + ... + x s a*a fc + e*a fc | 

— -^min (^ l)^^'max ||^ 1 1 - 

On the other hand, V/ ^ 5^(supp(x)), 

(29) |b*a/| = |^ia*a/ + a^a^ + ... + x s a*a z + e*a z | 

< a; max s?7 + ||e|| 2 . 



Therefore, the condition (27) implies that the right hand side of (28) is greater than the 



right hand side of (29). This means |b*afc| > |b*a/|, Vfc = 1, ..,s, V7 G" -B^(supp(x)) and hence 



the s highest points of |b*a fc | are in 5 J? (supp(x)). 



From step ii) of BMT and (26) it follows the second half of the statement, namely every 



nonzero component of x is in the //-coherence band of a unique nonzero component of x. □ 



Condition (26) roughly means that the objects are separated by at two RLs which is 
weaker than (15). In numerical simulations, however, BOMP performs far better than BMT. 
In other words, BMT is not a stand-alone algorithm but should instead be imbedded in other 
algorithms such as Subspace Pursuit (SP) jT4] . the Compressive Sampling Matching Pursuit 
(CoSaMP) [28] and the Normalized Iterative Hard Thresholding (IHT) [3]. This gives rise 
to Band-excluded Subspace Pursuit (BSP), Band-excluded Compressive Sampling Matching 
Pursuit (BCoSaMP) and Band-excluded Normalized Iterative Hard Thresholding (BNIHT) 
which we demonstrate their performance numerically. 

In addition to BMT, the second form of BET, namely the Band-excluded, Locally Opti- 
mized Thresholding (BLOT), can further enhance the performance in reconstruction with 
unresolved grids. 
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Algorithm 5. Band-excluded, Locally Optimized Thresholding (BLOT) 
Input: x = (xi, . . . , Xm), A, b, rj > 0. 
Initialization: S* = 0. 
Iteration: For n — 1, 2, s. 

1) z n = arg minj \xj\,j £ B^\S n ~ l ). 

2) S n = S™- 1 U {in}. 

Output: x = argmin ||Az — t» j 1 2 , supp(z) C LO(S* s ) where LO is the output of Algorithm 2. 

Now we state the algorithm Band-excluded, Locally Optimized Subspace Pursuit (BLOSP). 
The Band-excluded Locally Optimized CoSaMP (BLOCoSaMP) is similar and omitted here. 



Algorithm 6. Band-excluded, Locally Optimized Subspace Pursuit (BLOSP) 
Input: A, b, rj > 0. 
Initialization: x° = 0, r° = b 
Iteration: For n = 1,2, 

1) S n = supp(x n " 1 ) U supp(BMT(r n - 1 )) 

where BMT(r n_1 ) is the output of Algorithm 4 with data r n_1 . 

2) x" = argmin || Az — lo 1 1 2 s.t. supp(z) C S n . 

3) S n = supp(BLOT(x n )) where BLOT(x n ) is the output of Algorithm 5. 

4) r n = min z || Az — b|| 2 , supp(z) C S n . 

5) If ||r" _1 || 2 < e or ||r™|| 2 > ||r n_1 || 2 , then quit and set 5* = S 1 " -1 ; otherwise continue iteration. 
Output: x = argmin z ||Az — b|| 2 s.t. supp(z) C S. 



Embedding BLOT in NIHT turns out to have a nearly identical performance to embedding 
BLOT in the Iterative Hard Thresholding (IHT) [2]. Since the latter is simpler to implement 
and more efficient to compute, we state the resulting algorithm, the Band-excluded, Locally 
Optimized IHT (BLOIHT), below. 



Algorithm 7. Band-excluded, Locally Optimized Iterative Hard Thresholding (BLOIHT) 
Input: A, b, 77 > 0. 
Initialization: x° = 0, r° = b. 
Iteration: For n = 1,2, 

1) x™ = BLOT(x"~ 1 + A*^ 1 ) where BLOT denotes the output of Algorithm 5. 

2) If 1 1 r ^ 1 1 1 2 — e or || r "1|2 > ||r n_1 ||2, then quit and set S = S 1 " -1 ; otherwise continue iteration. 
Output: x. 

In addition, the technique BLOT can be used to enhance the recovery capability with 
unresolved grids of the L 1 -minimization principles, Basis Pursuit (BP) 

(30) minHzlli, subject to b = Az. 

z 

and the Lasso 

(31) min -||b — Az|| 2 + Acr||z||i, 

z 2 
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where a is the standard deviation of the each noise component and A is the regularization 
parameter. In this case, BLOT is applied to the BP and Lasso estimates x to produce a 
s-sparse reconstruction. The resulting algorithms are called BP-BLOT and Lasso-BLOT, 
respectively. 

The thresholded Lasso, the Lasso followed by a hard thresholding, has been considered 
previously (see |26j and references therein). The novelty of our version lies in the BE and 
LO steps which greatly enhance the performance in dealing with unresolved grids. 

5. Numerical study 
We test our various band-exclusion algorithms on the matrix 

(32) -= , k = 1, N, I = 1, RF 

where is uniformly and independently distributed in (0, 1). When F = 1, A is the random 
partial Fourier matrix analyzed in [30] and, with sufficient number of samples, successful 
recovery with A is guaranteed with high probability. For large F, however, A has a high 
mutual coherence. Unless otherwise stated, we use N = 100, M = 4000 and F = 20 and the 
i.i.d. Gaussian noise e ~ N(0, a 2 I) in our simulations. Recall the decay profile of pairwise 
coherence as the index separation increases in Figure [2] (right). The ^-coherence band is 
about 0.7 RL in half width with rj = 0.3. 

We compare performance of various algorithms in terms of success probability versus 
dynamic range, noise level, number of measurements and resolution. Unless otherwise stated, 
we use in our simulations 10 randomly phased and located objects, separated by at least 3 
RLs. A reconstruction is counted as a success if every reconstructed object is within 1 RL of 
the object support. This is equivalent to the criterion that the Bottleneck distance between 
the true support and the reconstructed support is less than 1 RL. 

For two subsets A and B in M. d of the same cardinality, the Bottleneck distance ds{A, B) 
is defined as follows. Let M. be the collection of all one-to-one mappings from A to B. Then 

cIb(A, B) = minmax la — f(a)\. 

feM aeA 

For subsets in one dimension, the Bottleneck distance can be calculated easily. Let A = 
{ax, . . . , a n } and B — {b\, . . . , b n } be listed in the ascending order. Then 

ds{A, B) = max |a,- — bj\. 

3 

In higher dimensions, however, it is more costly to compute the Bottleneck distance [2T1 124"] . 
The Bottleneck distance is a stricter metric than the Hausdorff distance which does not 
require one-to-one correspondence between the two target sets. 

In the first set of experiments, we test various greedy algorithms equipped with the BE 
step (only). This includes BOMP, Band-excluded Subspace Pursuit (BSP), Band-excluded 
CoSaMP (BCoSaMP) and Band excluded Normalized Iterative Hard Thresholding (BNIHT). 
For comparison, we also show the performance of OMP without BE. 

As shown in Figure [3] BOMP has the best performance with respect to dynamic range, 
noise and, in the case of higher dynamic range (> 3, bottom right panel), number of mea- 
surements. In the case of dynamic range equal to 1, BSP is the best performer in terms of 
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Figure 3. Success probability of various band-excluded algorithms versus 
dynamic range for 0% (top left) and 5% (top right) noise, versus noise level 
for dynamic range 1 (middle left) and 5 (middle right) and versus number of 
noiseless measurements for dynamic range 1 (bottom left) and 5 (bottom 
right). 
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Figure 4. Success probability versus dynamic range with 1% (left) and 3% 
noise (right). 



number of measurements followed closely by BCoSaMP and BNIHT (bottom panel). In the 
case of dynamic range equal to 1, BOMP and OMP have almost identical performance with 
respect to noise (middle left) and number of measurements (bottom left). The performance 
of BSP and BCoSaMP, however, depends crucially on the BE step without which both SP 
and CoSaMP fail catastrophically (not shown). 

In the next set of experiments, we test BLO-equipped algorithms. For the purpose of 
comparison, we also test the algorithm, Locally Optimized OMP (LOOMP) which is the 
same as Algorithm 3 but without BE. 

Lasso-BLOT is implemented with the regularization parameter 

(33) A = O^logM 
or 

(34) A = v^logM 

which is proposed in [10]. Other larger values have been proposed in [6] |7|. Our numerical 



experiments indicate that for matrix (32) with large F the choice (33) is nearly optimal 



among all A / y/log M < 10 and relative noise up to 5%. The superiority of the choice (33) 



to (34) (and other choices) manifests clearly across all performance figures involving both of 
them. 

Figure [4] shows success probability versus dynamic range in the presence of noise. The top 
performers are LOOMP and BLOOMP both of which can handle large dynamic range. In the 
noiseless case, the success rate for LOOMP, BLOOMP, BLOSP, BLOOMP and BLOCoSaMP 
stays near unity for dynamic range up to as high as 10 14 . With 1% noise (left panel), BLOSP, 



BLOCoSaMP and BLOIHT perform better than Lasso-BLOT with either (33) or (34) while 



with 3% noise (right panel), BLOSP, BLOCoSaMP and BLOIHT performance curves have 



dropped below that of Lasso-BLOT with (33). But the noise stability of Lasso-BLOT never 



catches up with that of LOOMP/BLOOMP even as the noise level increases as can be seen 
in Figure |5j 
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Figure 5. Success probability versus relative noise level with dynamic range 
1 (top) and 5 (bottom). 



Figure [5] shows that LOOMP and BLOOMP remain the top performers with respect to 
noise while Lasso-BLOT with (34) has the worst performance. Lasso-BLOT with (33), 
however, is a close second in noise stability. As seen in Figures [4] and [5j the performance of 
Lasso-BLOT depends significantly on the choice of the regularization parameter. 



With respect to number of measurements (Figure |6J), BP/Lasso-BLOT with (33) is the 
best performer, followed closely by BLOSP and BLOCoSaMP for dynamic range 1 (left 
panels) while for dynamic range 10, BLOOMP and LOOMP perform significantly better 
than the rest (right panels). As clear from the comparison of the top left and right panels 
of Figure [6j at low level of noise the performance of BLOOMP and LOOMP improves 
significantly as the dynamic range increases from 1 to 10. In the meantime, the performance 
of BLOSP, BCoSaMP and BLOIHT improves slightly while the performance of BP/Lasso- 
BLOT deteriorates. At 10% noise, however, the performance of BLOOMP and LOOMP is 
roughly unchanged as dynamic range increases while the performance of all other algorithms 
deteriorates significantly (bottom right). 

Next we compare the resolution performance of the various algorithms for 10 randomly 
phased objects of unit dynamic range in the absence of noise. The 10 objects are consecu- 
tively located and separated by equal length varying from 0.1 to 3 RLs. The whole object 
support is, however, randomly shifted for each of the 100 trials. For closely spaced objects, 
it is necessary to modify the band exclusion and local optimization rules: If h is the object 
spacing, we use h/2 to replace 2 RLs of the original BE rule and 1 RL of the original LO 
rule. 

Figure [7] shows the averaged Bottleneck distance between the reconstruction and the true 
object support (top panels) and the residual (bottom panels) as a function of the object 
spacing. For this class of objects, BP-BLOT has the best resolution for dynamic range up to 
10 followed closely by BLOIHT for dynamic range 1 and by BLOOMP/LOOMP for dynamic 
range 10. The high precision (i.e. nearly zero Bottleneck distance) resolution ranges from 
about 1.5 RLs for BP-BLOT to about 1.7 RLs for the rest. Consistent with what we have 
seen in Figure |6j the resolving power of BLOOMP/LOOMP improves significantly as the 
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Figure 6. Success probability versus the number of measurements with dy- 
namic range 1, 0% noise (top, left), dynamic range 1, 10% noise (bottom, left), 
dynamic range 10, 0% noise (top, right) and with dynamic range 10, 10% noise 
(bottom, right). 



dynamic range increases while that of BP-BLOT deteriorates. Note that in the case of unit 
dynamic range, BOMP recovers the support as well as BLOOMP/LOOMP does. BOMP, 
however, produces a high level of residual error even when objects are widely separated. 
There is essentially no difference between the resolving power of BLOOMP and LOOMP. 

It is noteworthy that the relative residuals of all tested algorithms peak at separation 
between 1 and 1.5 RLs and decline to zero as the object separation decreases. In contrast, the 
average Bottleneck distances increase as the separation decreases except for BP-BLOT when 
the separation drops below 0.5 RL. When the separation drops below 1 RL, the Bottleneck 
distance between the objects and the reconstruction indicates that the objects are not well 
recovered by any of the algorithms (top panels). The vanishing residuals in this regime 
indicates nonuniqueness of sparse solutions. 

Figures |4](7] show negligible difference between the performances of LOOMP and BLOOMP. 
To investigate their distinction more closely, we test the stability with respect to the gridding 
error for various F's (cf. Figure [T]). We consider randomly phased objects of dynamic range 
10 that are randomly located in [0, 1000) and separated by at least 3 RLs. We compute the 
reconstruction error in the Bottleneck distance and L 2 -norm averaged over 100 trials with 5% 
external noise and show the result in Figure [8j Evidently the advantage of BLOOMP over 
LOOMP lies in the cases when the refinement factor F is less than 10 and the gridding error 
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Figure 7. The average Bottleneck distance for dynamic range 1 (top left) 
and 10 (top right), and the relative residual for dynamic range 1 (bottom left) 
and 10 (bottom right) versus separation of objects 



is sufficiently large. When F > 10, the difference between their performances with respect 
to gridding error is negligible. On the other hand, for F = 5 both BLOOMP and LOOMP's 
reconstructions would have been considered a failure given the magnitudes of error in the 
Bottleneck distance. 



6. Comparison with other algorithms in the literature 

The present work is inspired by the performance guarantee established in [22J that the 
MUSIC algorithm aided by BMT produces a support estimate that is within 1 RL of the 
locations of sufficiently separated objects. 

In comparison to other CS literature on coherent and redundant dictionary, our work 
resembles those of [5] and [T2|, albeit with a different perspective. In fact, the algorithms 
developed in [5] and [TS] can not be applied to the spectral estimation problem formulated 
in the Introduction. The algorithms developed here, however, can be applied to their frame- 
based setting and this is what we will do below for the purpose of comparison. 
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Figure 8. The reconstruction error of LOOMP and BLOOMP as measured 
by the Bottleneck distance (left, semi-log) and the relative L 2 -norm (right) for 
10 objects of dynamic range 10 function of F. 



Following [Tj5] we consider the following problem 

(35) b = $y + e 

where $ is a iV x R i.i.d Gaussian matrix of mean and variance a 2 . The signal to be 
recovered is given by y = \I/x where ^ is the over-sampled, redundant DFT frame 

(36) y k>j = JL e - 2 * iik =^ =il , k = l,...,R, j = l,...,RF. 

V R 



As before F is the refinement factor. Combining (35) and (J36| ) we have the same form ^ 
with A = whose coherence pattern is shown in Figure [9] similar to Figure [2j 

In the simulation, we take N = 100, R = 200, F = 20 and a = -4= so that A e C 100x400 ° 
as before. We use randomly located and phased x = (xj) which are well separated in the 
sense that \xj — Xk\ > 3, Wj ^ k. 

The algorithm proposed in [19], Spectral Iterative Hard Thresholding (SIHT), relies on 
a measurement matrix $ satisfying some form of RIP and so does the frame-adapted BP 
proposed in [5] 

(37) min||**z||i s.t ||$z - b|| 2 < ||e|| 2 

which, in addition, requires the analysis coefficients ^*y to be s-sparse or s-compressible. 
The RIP is satisfied by i.i.d. Gaussian matrices of proper sizes. Their approaches are not 
applicable, however, when the sensing matrix A can not be decomposed into a product 
where $ has RIP. 

In the language of digital signal processing (37) is a /^-analysis method while the standard 
BP or Lasso and the BLOT-version are L 1 -synthesis methods. Both methods are based on 
the same principle of representational sparsity. In principle, the synthesis approach (such as 
the BP, Lasso and all BLO-based algorithms) is more general than the analysis approach since 
every analysis method can be recast as a synthesis method while many synthesis formulations 
have no equivalent analysis form. In practice, however, each of them performs the best on 
different types of signals [20] . 

19 




Figure 9. The coherence pattern of A = 



For example, the bottom right panel of Figure 10 shows the absolute values of the compo- 
nent of the vector SE^y in the order of descending magnitude. Clearly is neither sparse 
nor compressible. The shape of the curve can be roughly understood as follows. The DFT 
frame \I/ has a coherence band of roughly 1.5 RLs, corresponding to 30 columns, and since 
the 10 components in x are widely separated *f?*y has about 300 significant components. 
The long tail of the curve is due to the fact that the pairwise coherence of St* decays slowly 
as the separation increases. Therefore the analysis approach (37) would require a far higher 
number of measurements than 100 for accurate reconstruction. 

For the analysis approach, the main quantity of interest is y. So in our comparison 
experiments, we measure the performance by the relative error ||y — y 1 1 2 / 1 1 y 1 1 2 as m [51, [TP"]. 
We compute the averaged relative errors as dynamic range, noise level and the number of 
measurements vary. In each of the 100 trials, 10 randomly phased and located objects (i.e. 
x) of dynamic range 1 and i.i.d. Gaussian $ are generated. 



The results are shown in Figure 10. Consistently across the top left, top right and bottom 



left panels, the smallest error is achieved by BP-BLOT and Lasso-BLOT with (33) with 
respect to dynamic range (top left), noise (top right) and number of measurements (bottom 
left). The latter two plots are for dynamic range 1. BLOOMP and BLOSP perform among 
the best with respect to dynamic range (top left) and noise (top right) and can achieve the 
minimum error with increasing number of measurements (bottom left). The SIHT algo- 
rithm requires much higher number of measurements to get its error down (bottom left) and 
produces the highest level of error w.r.t. dynamic range (top left) and noise (top right). 

In Figure 10, we include the performance curves of OMP with various sparsities (s, 2s, 
5s) as well as the standard BP/Lasso without BLOT. Not surprisingly, the relative error of 
OMP reconstruction decreases with increasing sparsity. 

It is noteworthy that the BLOT technique reduces the BP/Lasso reconstruction errors: 
BP without BLOT produces 0.5% relative error with respect to dynamic range (top left, not 
clearly visible) while BP-BLOT, along with BLOSP and BLOOMP, produces 10" 16 relative 
error. Moreover, Lasso with the optimized parameter (33) but without BLOT produces 
significantly higher errors than Lasso-BLOT, BLOOMP and BLOSP (top right). 
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Figure 10. Relative errors versus dynamic range (top left), relative noise 
(top right) and number of measurements (bottom left). The bottom right 
panel shows the magnitudes of a realization of the coefficient vector \l/*y in 
the descending order of magnitude. 



7. Conclusions and Discussions 

We have developed and tested various algorithms for sparse recovery with highly coherent 
sensing matrices arising in discretization of imaging problems in continuum such as radar 
and medical imaging when the grid spacing is below the Rayleigh threshold [16j. 

We have introduced two essential techniques to deal with unresolved grids: band exclusion 
and local optimization. We have embedded these techniques in various CS algorithms and 
performed systematic tests on them. When embedded in OMP, both BE and LO steps 
manifest their advantage in dealing with larger dynamic range. When embedded in SP, 
CoSaMP, IHT, BP and Lasso the effects are more dramatic. 

We have studied these modified algorithms from four performance metrics: dynamic range, 
noise stability, sparsity and resolution. With respect to the first two metrics (dynamic range 
and noise stability), BLOOMP is the best performer. With respect to sparsity, BLOOMP 
is the best performer for high dynamic range while for dynamic range near unity BP-BLOT 
and Lasso-BLOT with the optimal regularization parameter have the best performance. 
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BP-BLOT also has the highest resolving power up to certain dynamic range. Lasso-BLOT's 
performance, however, is sensitive to the choice of regularization parameter 

One of the most surprising attributes of BLOOMP is improved performance with respect 
to sparsity at larger dynamic range and low noise. 

The algorithms BLOSP, BLOCoSaMP and BLOIHT are good alternatives to BLOOMP 
and BP/Lasso-BLOT: they are faster than both BLOOMP and BP/Lasso-BLOT and shares, 
to a lesser degree, BLOOMP's desirable attribute with respect to dynamic range. 

Comparisons with existing algorithms (SIHT and frame-adapted BP) demonstrate the 
superiority of BLO-enhanced algorithms for reconstruction of sparse objects separated above 
the Rayleigh length. 

Finally to add to the debate of analysis versus synthesis [20], the performance of BLO- 
based algorithms for sparse, widely separated objects are independent of the refinement 
factors F representing redundancy, and, since the discretization error decreases with F, the 
reconstruction errors of the BLO-based synthesis methods also decrease with F in stark 
contrast to the examples presented in [20J which show that the synthesis approach degrades 
with redundancy. 
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